Xiangyu Chang, Danyang Huang, and Hansheng Wang. A Popularity Scaled Latent Space Model for Network Structure Formulation. Statistica Sinica (accepted). 2018
igraph is a collection of network analysis tools with the emphasis on efficiency, portability and ease of use. igraph is open source and free. igraph can be programmed in R, Python and C/C++.
igraph has three basic functionalities.
library(igraph)
g1 <- graph.empty()
g2 <- graph( c(1,2,2,3,3,4,5,6), directed=TRUE )
g3 <- graph.star(10, mode="out")
g4 <- graph.lattice(c(5,5))
g5 <- graph.lattice(length=5, dim=2)
g6 <- graph.ring(10)
g7 <- graph.tree(10, 2)
g8 <- graph.full(5, loops=TRUE)
g9 <- graph.full.citation(10)
g10 <- graph.atlas(sample(0:1252, 1))
el <- matrix( c("foo", "bar", "bar", "foobar"), nc=2, byrow=TRUE)
g11 <- graph.edgelist(el)
g12 <- graph.extended.chordal.ring(15, matrix(c(3,12,4,7,8,11), nr=2))
plot(): does simple non-interactive 2D plotting to R devices.
tkplot(): does interactive 2D plotting using the tcltk package. It can only handle graphs of moderate size; a thousand vertices is probably already too many.
rglplot(): is an experimental function to draw graphs in 3D using OpenGL.
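As a minimal sketch of the non-interactive route, plot() can draw to any R graphics device, including a file device. The graph, the temporary file name and the choice of the pdf() device here are illustrative assumptions rather than part of the examples in this post.

```r
library(igraph)

# Illustrative graph; make_ring() is the newer name for graph.ring().
g_demo <- make_ring(10)

# Draw to a PDF file device instead of the screen (hypothetical temp file).
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(g_demo, vertex.size = 10, vertex.label = NA)
dev.off()

# The file should now exist on disk.
file.exists(out_file)
```

Writing to a file device like this is handy in scripts, where no screen device is available.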
g2 <- graph( c(1,2,2,3,3,4,5,6), directed=TRUE )
plot(g2)
g3 <- graph.star(10, mode="out")
plot(g3)
g5 <- graph.lattice(length=5, dim=2)
plot(g5)
g6 <- graph.ring(10)
plot(g6)
g7 <- graph.tree(10, 2)
plot(g7)
g8 <- graph.full(5, loops=TRUE)
plot(g8)
g12 <- graph.extended.chordal.ring(15, matrix(c(3,12,4,7,8,11), nr=2))
plot(g12)
test <- read.csv('block4.csv',
                 header = FALSE, stringsAsFactors = FALSE)
g <- graph.data.frame(test,directed = FALSE)
plot(g, vertex.size=5, layout=layout.fruchterman.reingold, vertex.shape='circle', vertex.label.cex=1.0, vertex.label.color='black', vertex.label=NA)
# classic random graphs
g13 <- erdos.renyi.game(100, 2/100, type='gnp')
plot(g13, layout=layout.fruchterman.reingold,
     vertex.size=5, vertex.label=NA)
# preferential attachment and variations
g14 <- barabasi.game(100)
plot(g14, layout=layout.fruchterman.reingold,
     vertex.size=5, vertex.label=NA, edge.arrow.size=0.1)
Plotting parameters
| NODES | Description |
|---|---|
| vertex.color | Node color |
| vertex.frame.color | Node border color |
| vertex.shape | One of “none”, “circle”, “square”, “csquare”, “rectangle”, “crectangle”, “vrectangle”, “pie”, “raster”, “sphere” |
| vertex.size | Size of the node (default is 15) |
| vertex.size2 | The second size of the node (e.g. for a rectangle) |
| vertex.label | Character vector used to label the nodes |
| vertex.label.family | Font family of the label (e.g.“Times”, “Helvetica”) |
| vertex.label.font | Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol |
| vertex.label.cex | Font size (multiplication factor, device-dependent) |
| vertex.label.dist | Distance between the label and the vertex |
| vertex.label.degree | The position of the label in relation to the vertex, where 0 is right, “pi” is left, “pi/2” is below, and “-pi/2” is above |

| EDGES | Description |
|---|---|
| edge.color | Edge color |
| edge.width | Edge width, defaults to 1 |
| edge.arrow.size | Arrow size, defaults to 1 |
| edge.arrow.width | Arrow width, defaults to 1 |
| edge.lty | Line type, could be 0 or “blank”, 1 or “solid”, 2 or “dashed”, 3 or “dotted”, 4 or “dotdash”, 5 or “longdash”, 6 or “twodash” |
| edge.label | Character vector used to label edges |
| edge.label.family | Font family of the label (e.g.“Times”, “Helvetica”) |
| edge.label.font | Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol |
| edge.label.cex | Font size for edge labels |
| edge.curved | Edge curvature, range 0-1 (FALSE sets it to 0, TRUE to 0.5) |
| edge.arrow.mode | Vector specifying whether edges should have arrows; possible values: 0 no arrow, 1 back, 2 forward, 3 both |

| OTHER | Description |
|---|---|
| margin | Empty space margins around the plot, vector with length 4 |
| frame | if TRUE, the plot will be framed |
| main | If set, adds a title to the plot |
| sub | If set, adds a subtitle to the plot |
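The same settings can also be stored on the graph itself. This is a minimal sketch, assuming igraph's convention that plot() picks up vertex and edge attributes named like the parameters above without the "vertex." or "edge." prefix. The star graph and the chosen values are illustrative only.

```r
library(igraph)

# Illustrative graph: one hub connected to five leaves.
g_attr <- make_star(6, mode = "undirected")

# Store plotting parameters as vertex/edge attributes (no prefix needed).
V(g_attr)$color <- "red"
V(g_attr)$size <- 8
V(g_attr)$frame.color <- "gray"
E(g_attr)$width <- 2
E(g_attr)$curved <- 0.2

# plot() reads the stored attributes; arguments passed here override them.
plot(g_attr, vertex.label = NA, main = "Attributes stored on the graph")
```

Storing the values as attributes keeps the styling with the graph, which helps when the same graph is plotted several times.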
plot(g14, edge.arrow.size=.2, vertex.color="red", vertex.size=8, vertex.frame.color="gray", vertex.label.color="black", vertex.label.cex=0.4, vertex.label.dist=2, edge.curved=0.2)
nodes <- read.csv("netscix2016/Dataset1-Media-Example-NODES.csv", header=T, as.is=T)
links <- read.csv("netscix2016/Dataset1-Media-Example-EDGES.csv", header=T, as.is=T)
head(nodes)
##    id               media media.type type.label audience.size
## 1 s01            NY Times          1  Newspaper            20
## 2 s02     Washington Post          1  Newspaper            25
## 3 s03 Wall Street Journal          1  Newspaper            30
## 4 s04           USA Today          1  Newspaper            32
## 5 s05            LA Times          1  Newspaper            20
## 6 s06       New York Post          1  Newspaper            50
head(links)
##   from  to weight      type
## 1  s01 s02     10 hyperlink
## 2  s01 s02     12 hyperlink
## 3  s01 s03     22 hyperlink
## 4  s01 s04     21 hyperlink
## 5  s04 s11     22   mention
## 6  s05 s15     21   mention
nrow(nodes); length(unique(nodes$id))
## [1] 17
## [1] 17
nrow(links); nrow(unique(links[,c("from", "to")]))
## [1] 52
## [1] 49
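The link table has 52 rows but only 49 distinct from/to pairs, so some pairs occur more than once. The effect of collapsing such duplicates with aggregate() can be sketched on a toy edge list. The data frame here is hypothetical and only mimics the column layout of links.

```r
# Toy edge list (hypothetical data): the a -> b link of type "x" appears twice.
toy <- data.frame(
  from   = c("a", "a", "b"),
  to     = c("b", "b", "c"),
  weight = c(1, 2, 5),
  type   = c("x", "x", "y"),
  stringsAsFactors = FALSE
)

# More rows than distinct (from, to) pairs signals duplicated links.
nrow(toy)
nrow(unique(toy[, c("from", "to")]))

# Sum the weights within each (from, to, type) group.
collapsed <- aggregate(toy["weight"], toy[c("from", "to", "type")], sum)
collapsed
```

Grouping by type as well as by from and to keeps links of different types apart, which is the reason for not calling simplify() at this stage.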
# Collapse multiple links of the same type between the same two nodes
# by summing their weights, using aggregate() by "from", "to", & "type":
# (we don't use "simplify()" here so as not to collapse different link types)
links <- aggregate(links[,3], links[,-3], sum)
links <- links[order(links$from, links$to),]
colnames(links)[4] <- "weight"
rownames(links) <- NULL
nodes2 <- read.csv("netscix2016/Dataset2-Media-User-Example-NODES.csv", header=T, as.is=T)
links2 <- read.csv("netscix2016/Dataset2-Media-User-Example-EDGES.csv", header=T, row.names=1)
# Examine the data:
head(nodes2)
##    id   media media.type media.name audience.size
## 1 s01     NYT          1  Newspaper            20
## 2 s02    WaPo          1  Newspaper            25
## 3 s03     WSJ          1  Newspaper            30
## 4 s04    USAT          1  Newspaper            32
## 5 s05 LATimes          1  Newspaper            20
## 6 s06     CNN          2         TV            56
head(links2)
##     U01 U02 U03 U04 U05 U06 U07 U08 U09 U10 U11 U12 U13 U14 U15 U16 U17
## s01   1   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0
## s02   0   0   0   1   1   0   0   0   0   0   0   0   0   0   0   0   0
## s03   0   0   0   0   0   1   1   1   1   0   0   0   0   0   0   0   0
## s04   0   0   0   0   0   0   0   0   1   1   1   0   0   0   0   0   0
## s05   0   0   0   0   0   0   0   0   0   0   1   1   1   0   0   0   0
## s06   0   0   0   0   0   0   0   0   0   0   0   0   1   1   0   0   1
##     U18 U19 U20
## s01   0   0   0
## s02   0   0   1
## s03   0   0   0
## s04   0   0   0
## s05   0   0   0
## s06   0   0   0
# links2 is an adjacency matrix for a two-mode network:
links2 <- as.matrix(links2)
dim(links2)
## [1] 10 20
dim(nodes2)
## [1] 30  5
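The dimensions fit a two-mode reading: 10 rows plus 20 columns account for the 30 nodes in nodes2. As a rough sketch of how such a matrix maps to a graph, the toy example here turns every 1 in a hypothetical 2 x 3 matrix into an edge and marks the two node sets with a logical type attribute. The matrix, the variable names and the route through graph_from_data_frame() are assumptions made for illustration, not necessarily how the original data were loaded.

```r
library(igraph)

# Hypothetical two-mode matrix: rows are media, columns are users.
m <- matrix(c(1, 1, 0,
              0, 1, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("s01", "s02"), c("U01", "U02", "U03")))

# Every 1 in the matrix becomes one row-to-column edge.
idx <- which(m == 1, arr.ind = TRUE)
edges <- data.frame(from = rownames(m)[idx[, "row"]],
                    to   = colnames(m)[idx[, "col"]],
                    stringsAsFactors = FALSE)

# Vertex table: type is FALSE for row nodes and TRUE for column nodes.
verts <- data.frame(name = c(rownames(m), colnames(m)),
                    type = c(rep(FALSE, nrow(m)), rep(TRUE, ncol(m))),
                    stringsAsFactors = FALSE)

bg <- graph_from_data_frame(edges, directed = FALSE, vertices = verts)

# Node count is rows plus columns; edge count is the number of 1s.
vcount(bg)
ecount(bg)
```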
Graph attributes
Computing features of graphs
Community Detection
Link Prediction
g <- barabasi.game(30)
degree(g)
>  [1] 13  5  4  1  1  1  5  1  1  2  1  2  1  1  2  1  1  1  1  1  1  2  1
> [24]  2  1  1  1  1  1  1
E(g)
> + 29/29 edges from 365327d:
>  [1]  2-> 1  3-> 1  4-> 2  5-> 1  6-> 1  7-> 1  8-> 1  9-> 7 10-> 2 11-> 3
> [11] 12-> 7 13-> 1 14-> 2 15->12 16-> 2 17-> 1 18->10 19-> 3 20->15 21-> 1
> [21] 22-> 7 23->22 24-> 1 25-> 3 26-> 1 27-> 7 28-> 1 29->24 30-> 1
V(g)
> + 30/30 vertices, from 365327d:
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> [24] 24 25 26 27 28 29 30
shortest.paths(g, v = 1)
>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
> [1,]    0    1    1    2    1    1    1    1    2     2     2     2     1
>      [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24]
> [1,]     2     3     2     1     3     2     4     1     2     3     1
>      [,25] [,26] [,27] [,28] [,29] [,30]
> [1,]     2     1     2     1     2     1
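Because barabasi.game() is random, the numbers above change from run to run. A deterministic sketch on a five-vertex ring makes the same queries easy to check by hand. It assumes the newer function names make_ring() and distances(), which correspond to graph.ring() and shortest.paths().

```r
library(igraph)

# A ring of five vertices: 1-2-3-4-5-1.
ring <- make_ring(5)

# Every vertex on a ring has exactly two neighbours.
deg <- degree(ring)
deg

# Vertex and edge counts.
vcount(ring)
ecount(ring)

# Shortest-path lengths from vertex 1: going either way round the ring,
# the farthest vertices (3 and 4) are two steps away.
d <- distances(ring, v = 1)
d
```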
Centrality: closeness(), betweenness() and page.rank()
Community Detection: walktrap.community(), spinglass.community() and edge.betweenness.community()
Others
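The centrality functions can be tried on a small star graph, where the hub should come out on top under every measure. This is a sketch, assuming page_rank(), the newer name for page.rank(). The graph is illustrative only.

```r
library(igraph)

# Undirected star: vertex 1 is the hub, vertices 2 to 5 are leaves.
star <- make_star(5, mode = "undirected")

# Betweenness: every shortest path between two leaves passes through the hub.
btw <- betweenness(star)
btw

# Closeness: the hub is nearest to all other vertices.
cls <- closeness(star)
which.max(cls)

# PageRank scores form a probability vector over the vertices.
pr <- page_rank(star)$vector
which.max(pr)
```

With four leaves there are six leaf pairs, so the hub's betweenness is 6 and every leaf's is 0.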
karate <- make_graph("Zachary")
wc <- cluster_walktrap(karate)
modularity(wc)
## [1] 0.3532216
membership(wc)
##  [1] 1 1 2 1 5 5 5 1 2 2 5 1 1 2 3 3 5 1 3 1 3 1 3 4 4 4 3 4 2 3 2 2 3 3
plot(wc, karate)
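For comparison, the same graph can be run through a second algorithm. This sketch assumes cluster_edge_betweenness(), the newer name for edge.betweenness.community(), and uses compare() with normalized mutual information to measure how closely the two partitions agree. No expected values are shown because they were not part of the original run.

```r
library(igraph)

karate <- make_graph("Zachary")

# Two community detection methods on the same graph.
wt <- cluster_walktrap(karate)
eb <- cluster_edge_betweenness(karate)

# Number and sizes of the communities found by edge betweenness.
length(eb)
sizes(eb)

# Modularity of each partition (higher means denser within-group links).
modularity(wt)
modularity(eb)

# Agreement between the partitions: 1 means identical, 0 means unrelated.
nmi <- compare(wt, eb, method = "nmi")
nmi
```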